Let’s Git Started
Installation
Git installation: Windows
- To use Git/GitHub and Unix commands, install Git Bash for Windows from:
- Configure git, your name and email address
- In Positron (VScode)
- Open the command palette with Ctrl + Shift + P
- Type “Terminal: Select Default Profile”
- Choose Git Bash for your default terminal
Git for macOS
- Check whether the Xcode command line tools are installed (if they are, skip this step)
- Open terminal
- Configure git, your name and email address
Note
You should also create an account on GitHub. Optionally, register with your USF account for student privileges.
Check that the terminal recognizes the git command: type git in the terminal and confirm that no error message appears.
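The check and the configuration step mentioned above might look like the following. This is a minimal sketch: the name and email are placeholders, not values from this course, so replace them with your own.

```shell
# Confirm that the terminal can find git (prints the installed version).
git --version

# Tell git who you are; these values are recorded in every commit you make.
# "Your Name" and "you@example.com" are placeholders.
git config --global user.name "Your Name"
git config --global user.email "you@example.com"

# Read back the stored values.
git config --global --get user.name
git config --global --get user.email
```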
Github Login
Make sure you are able to login to Github.
Introduction to Git/GitHub
Why do we need Git?
Git
Distributed version control system, created by Linus Torvalds (2005)
“Track changes” made by team members, merge into main, etc.
Considered a “must” for software development, and many data science projects.
Has a learning curve, but it’s worth learning even for solo projects.
GitHub
An online hosting platform built on the Git system
Easy to browse other repositories (public, free)
Others: GitLab, Bitbucket, GitBucket, etc…
Why Git?
Version control: track changes, revert to previous versions
Collaboration: multiple people can work on the same project
Backup: store your project on the cloud
Portfolio: showcase your work to potential employers
Open source: contribute to other projects
Is Git the same as GitHub?
Nope.
Git can be used entirely locally (without internet).
GitHub hosts Git repositories online, in the cloud.
Does Git/Github work with any files?
Git is primarily designed to handle text files and tracks changes line by line.
It can store binary files (e.g. images, PDFs), but it cannot show line-by-line differences for them as it does for text files.
GitHub has a file size limit (100MB)
It is intended for code tracking; large data should be stored elsewhere
Git Terminology/workflow
Git shell commands
Git clone
git clone downloads a full copy of a remote repository. It is usually run once, the first time you get the project.
Git init
git init initializes a new git repository in the current working directory.
- This command creates a new hidden subdirectory named .git
- To stop tracking the project with git, remove this folder. Doing so deletes the entire local history.
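A small sketch of what git init does, run in a throwaway folder created with mktemp (an assumption made so that nothing in your own project is touched):

```shell
# Work in a temporary folder.
demo_dir="$(mktemp -d)"
cd "$demo_dir"

# Turn the current folder into a git repository.
git init

# The hidden .git folder now holds everything git knows about the project.
ls -a

# Ask git whether we are inside a working tree (prints "true").
git rev-parse --is-inside-work-tree
```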
Stage files
git add stages files for commit.
Make commit!
git commit records the staged changes as a new version of your project, identified by a hash. You must provide a message following -m.
- As the name suggests, treat a commit as a deliberate step in your work, like signing a document.
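Staging and committing can be sketched as follows. The temporary folder and the "Demo User" identity are assumptions that keep the example self-contained. The file name is borrowed from the lab.

```shell
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init
# Identity for this repository only (placeholder values).
git config user.name "Demo User"
git config user.email "demo@example.com"

# Create a file; at this point git sees it as untracked.
echo 'print("Hello world!")' > my_first_R_code.R

# Stage the file: it becomes part of the next commit.
git add my_first_R_code.R

# Record the staged snapshot with a message.
git commit -m "My first commit"

# Every commit is identified by a hash.
git rev-parse HEAD
```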
Set remote repository
git remote add registers a remote repository under a name of your choice.
origin is the conventional name for the main remote repository (here, the one on GitHub).
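A sketch of registering a remote. The address is the placeholder used later in the lab, not a real repository. Adding a remote only stores the address, so no network access is needed.

```shell
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init

# Register the remote under the conventional name "origin".
# Replace the placeholder address with your own repository.
git remote add origin https://github.com/yourGithubId/Reponame.git

# List the remotes and their addresses (one line for fetch, one for push).
git remote -v
```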
Push to repository
git push sends your commit to the remote repository.
Pull from repository
git pull does two operations altogether:
- fetch: download changes from the remote server
- merge: apply changes to your local file
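Push and pull can be tried without a GitHub account. In this sketch a local bare repository stands in for the remote, which is an assumption made so the example runs offline. The folder names laptop and desktop are hypothetical.

```shell
work="$(mktemp -d)"
cd "$work"

# A bare repository plays the role of the remote server.
git init --bare remote.git
# Make "main" the remote's default branch, whatever this git version defaults to.
git --git-dir=remote.git symbolic-ref HEAD refs/heads/main

# First working copy: commit and push.
git init laptop
cd laptop
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "first line" > notes.txt
git add notes.txt
git commit -m "Add notes"
git branch -M main
git remote add origin "$work/remote.git"
git push -u origin main

# Second working copy: clone once.
git clone "$work/remote.git" "$work/desktop"

# New work is pushed from the first copy...
echo "second line" >> notes.txt
git commit -am "Extend notes"
git push

# ...and picked up in the second copy with git pull (fetch + merge).
cd "$work/desktop"
git pull
cat notes.txt
```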
Git status / log
git status shows the status of changes as untracked, modified, or staged.
git log shows the history of commits and its corresponding hashes.
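The three states reported by git status can be produced on purpose. This is a sketch in a temporary folder, and the file names are hypothetical.

```shell
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "a" > tracked.txt
git add tracked.txt
git commit -m "Add tracked.txt"

echo "b" >> tracked.txt      # modified, not staged
echo "c" > staged.txt
git add staged.txt           # staged
echo "d" > untracked.txt     # untracked

# Long form, with hints about what to do next.
git status
# Short form: one line per file with a two-character status code.
git status --short
# One line per commit: abbreviated hash and message.
git log --oneline
```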
Git Reset
To go back to a previous commit (version) B:
git reset B
- Changes made after B are kept in your files but unstaged
- Same as git reset --mixed B
git reset --soft B
- Changes made after B are kept and staged
git reset --hard B
- Files match commit B exactly; all later changes are discarded.
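The three reset modes can be compared side by side. This sketch makes two commits, labeled B and C here for illustration, in a temporary folder and stores their hashes in shell variables.

```shell
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init
git config user.name "Demo User"
git config user.email "demo@example.com"

# Commit B: the version we want to return to.
echo "version B" > file.txt
git add file.txt
git commit -m "B"
B="$(git rev-parse HEAD)"

# Commit C: a later change.
echo "version C" > file.txt
git commit -am "C"
C="$(git rev-parse HEAD)"

# --soft: history moves back to B, the later change stays staged.
git reset --soft "$B"
git status --short        # "M  file.txt" (staged)

# Return to C, then use the default (--mixed): change kept but unstaged.
git reset --hard "$C"
git reset "$B"
git status --short        # " M file.txt" (unstaged)

# --hard: file contents are reset too; the later change is discarded.
git reset --hard "$B"
cat file.txt              # version B
```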
In Class Lab
1) My First Commit and Push to Github
- Create your own folder for this class
Name the folder nicely
Create subfolders as:
In R folder, create “my_first_R_code.R” file
- Initialize Git on the folder
and check the status and log on your directory:
- Create your remote repository on Github website:
- Create public repo
- Name your repository nicely
- DO NOT ADD README FILE (leave it unchecked)
- Copy the address somewhere
- It will look like https://github.com/yourGithubId/Reponame.git
- Stage files (add files)
Check your status and see the difference
- Commit the staged version
It creates a “version”, a check-point of your project.
Check your status and log also:
- Name your local, initial branch as “main”
Check your branches from current directory with
- Set the remote repository (your GitHub) address as “origin”
Browse your branches now again, and check the differences
- Push your local commit to upstream (remote repo)
Browse your github site, and see files are uploaded.
https://github.com/yourGithubId/Reponame.git
- Share the address with me.
By default, git will only track files, not empty folders. Folders are tracked when files are in it.
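The whole first lab can be rehearsed offline with the sketch below. It rests on several assumptions: a local bare repository stands in for GitHub, and the folder names my_class and data are hypothetical (only the R folder is named in the lab). In the real lab, set REMOTE_URL to your own address of the form https://github.com/yourGithubId/Reponame.git.

```shell
# Stand-in for GitHub so the sketch runs without a network connection.
sandbox="$(mktemp -d)"
git init --bare "$sandbox/remote.git"
REMOTE_URL="$sandbox/remote.git"

# 1. Class folder with subfolders, and an empty R script.
mkdir -p "$sandbox/my_class/R" "$sandbox/my_class/data"
cd "$sandbox/my_class"
touch R/my_first_R_code.R

# 2. Initialize git and check the status.
git init
git config user.name "Demo User"
git config user.email "demo@example.com"
git status

# 3. Stage everything and compare the status.
git add .
git status

# 4. Commit, then check status and log.
git commit -m "My first commit"
git status
git log --oneline

# 5. Name the branch "main" and list local branches.
git branch -M main
git branch

# 6. Register the remote as "origin".
git remote add origin "$REMOTE_URL"
git remote -v

# 7. Push and set the upstream; -a now also lists the remote branch.
git push -u origin main
git branch -a

# The empty data/ folder was never committed: only the R script is tracked.
git ls-files
```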
2) My Second Commit and Push to Github
Now, make changes to your R/my_first_R_code.R file:
Write and modify the file: add print("Hello world!") to the code and save.
Then add, commit, and push to GitHub.
- commit message: “My second commit”
Browse history with:
3) Reset changes (Time machine)
Now, suppose your second commit (version) was an error.
You can go back to the previous commit with:
or
If you want your remote to be back to the first commit as well:
Note that this requires the --force option, because the push rewrites the history on the remote.
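One possible way to undo the second commit locally and on the remote is sketched below. The local bare repository standing in for GitHub is an assumption, and HEAD~1 (the commit before the current one) is only one way to name the target: a hash copied from git log works as well.

```shell
sandbox="$(mktemp -d)"
git init --bare "$sandbox/remote.git"
git init "$sandbox/project"
cd "$sandbox/project"
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$sandbox/remote.git"

echo "# first version" > my_first_R_code.R
git add my_first_R_code.R
git commit -m "My first commit"
git branch -M main
git push -u origin main

echo 'print("Hello world!")' >> my_first_R_code.R
git commit -am "My second commit"
git push

# Go back one commit locally and discard the later change.
git reset --hard HEAD~1

# A plain git push would be rejected now, because the remote is ahead.
# --force overwrites the remote branch with the local one.
git push --force origin main

git log --oneline
```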
4) Clone Class folder
- Navigate to the Classroom subfolder
- Clone the class folder to your machine
- I will be uploading class materials here from now on.
For future updates, use git pull.
Github and IDE
When using GitHub from other apps (e.g. Positron, VS Code), you will authenticate with a PAT (personal access token) instead of your password.
- Install package from R and generate token
- Generate token from the page, with no expiration (or long enough)
- Copy the token to clipboard (you won’t see the token again)
- Store it somewhere temporarily
- Use it as your password from your Positron
More Git concepts
Three pull methods
By default, git pull is git fetch followed by git merge.
git pull has three methods:
- git pull --no-rebase (merge)
- git pull --rebase
- git pull --ff-only
git merge
Combines the remote and local changes in a new merge commit. If both sides changed the same lines, you must resolve the conflicts by hand.
- B, C are your local change
Example: Git merge scenario
Content of file.txt
Alice edits the second line, commits, and pushes to main first:
Bob also edits the same line without pulling Alice’s changes. He edits and commits locally:
When Bob tries to push with git push, Git rejects the push because his local branch is behind the remote.
Bob runs git pull, which is git fetch followed by git merge, and Git reports the conflict:
The file.txt now contains conflict markers
To resolve the conflict, Bob edits the file manually.
Bob stages this file and commits the resolution:
After resolving, Bob pushes with git push.
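The merge scenario can be replayed on one machine. In this sketch a local bare repository stands in for GitHub, Alice and Bob are two folders, and the contents of file.txt are made up for illustration. The pull is written as git pull --no-rebase so that the merge behavior is explicit regardless of local configuration.

```shell
sandbox="$(mktemp -d)"
cd "$sandbox"
git init --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/main

# Alice creates file.txt and publishes it.
git init alice && cd alice
git config user.name "Alice"; git config user.email "alice@example.com"
printf 'line 1\nline 2\nline 3\n' > file.txt
git add file.txt && git commit -m "A: initial file"
git branch -M main
git remote add origin "$sandbox/remote.git"
git push -u origin main

# Bob clones the same starting point.
git clone "$sandbox/remote.git" "$sandbox/bob"

# Alice edits the second line and pushes first.
printf 'line 1\nline 2 (Alice)\nline 3\n' > file.txt
git commit -am "B: Alice edits line 2" && git push

# Bob edits the same line without pulling; his push is rejected.
cd "$sandbox/bob"
git config user.name "Bob"; git config user.email "bob@example.com"
printf 'line 1\nline 2 (Bob)\nline 3\n' > file.txt
git commit -am "C: Bob edits line 2"
git push || echo "push rejected: Bob is behind the remote"

# Fetch + merge; the conflict makes the command exit with an error status.
git pull --no-rebase || echo "merge stopped on a conflict"
cat file.txt      # shows the <<<<<<<, =======, >>>>>>> markers

# Bob resolves by hand (here he keeps both edits), stages, and commits.
printf 'line 1\nline 2 (Alice and Bob)\nline 3\n' > file.txt
git add file.txt
git commit -m "Merge Alice's change into Bob's"
git push
git log --oneline --graph
```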
git rebase
Git replays your local branch on top of the remote branch, creating a linear history:
Example: Git rebase scenario
Content of file.txt
Alice edits the second line, commits, and pushes to main first:
Bob also edits the same line without pulling Alice’s changes. He edits and commits locally:
When Bob tries to push with git push, Git rejects the push because his local branch is behind the remote.
Bob runs git pull --rebase, which is git fetch followed by git rebase:
Instead of creating a merge commit, Git:
- rewinds Bob’s local commit C
- applies Alice’s changes from the remote (origin/main), commit B
- reapplies C on top of B
When there is no conflict (i.e., they did not change the same lines), git pull --rebase completes without any conflict message.
However, in our scenario both modified the same line, so there is a conflict:
- the rebase is halted
- it must be continued after the conflict is resolved
As with merge, the conflict must be resolved manually.
The file.txt now contains conflict markers
To resolve the conflict, Bob edits the file manually.
Bob stages the file and continues the rebase (no new commit is needed)
After that, Bob pushes with git push. The history is linear.
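The rebase scenario, replayed with the same kind of setup: a local bare repository stands in for GitHub and the file contents are made up. GIT_EDITOR=true is used only so that git rebase --continue keeps Bob's commit message without opening an editor. In an interactive session you can omit it.

```shell
sandbox="$(mktemp -d)"
cd "$sandbox"
git init --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/main

git init alice && cd alice
git config user.name "Alice"; git config user.email "alice@example.com"
printf 'line 1\nline 2\nline 3\n' > file.txt
git add file.txt && git commit -m "A: initial file"
git branch -M main
git remote add origin "$sandbox/remote.git"
git push -u origin main

git clone "$sandbox/remote.git" "$sandbox/bob"

# B: Alice edits line 2 and pushes first.
printf 'line 1\nline 2 (Alice)\nline 3\n' > file.txt
git commit -am "B: Alice edits line 2" && git push

# C: Bob edits the same line locally.
cd "$sandbox/bob"
git config user.name "Bob"; git config user.email "bob@example.com"
printf 'line 1\nline 2 (Bob)\nline 3\n' > file.txt
git commit -am "C: Bob edits line 2"

# Replay C on top of origin/main; the rebase halts on the conflict.
git pull --rebase || echo "rebase halted on a conflict"

# Resolve, stage, and continue. No commit is made by hand.
printf 'line 1\nline 2 (Alice and Bob)\nline 3\n' > file.txt
git add file.txt
GIT_EDITOR=true git rebase --continue
git push
git log --oneline --graph     # a single straight line: A, B, C
```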
git merge --ff-only
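Used when you expect your local branch to have no commits of its own.
Use it when you want a linear history and only want to move your local branch pointer forward, without creating a merge commit or rewriting commits.
- Git refuses if the local and remote branches have diverged, even when the changes would not conflict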
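A sketch of both outcomes of a fast-forward-only pull, again with a local bare repository standing in for GitHub and made-up file names. In the second case Bob and Alice touch different files, so there is nothing to conflict, yet the pull is still refused because Bob has a commit of his own.

```shell
sandbox="$(mktemp -d)"
cd "$sandbox"
git init --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/main

git init alice && cd alice
git config user.name "Alice"; git config user.email "alice@example.com"
echo "line 1" > file.txt
git add file.txt && git commit -m "A"
git branch -M main
git remote add origin "$sandbox/remote.git"
git push -u origin main

git clone "$sandbox/remote.git" "$sandbox/bob"
echo "line 2" >> file.txt
git commit -am "B" && git push

cd "$sandbox/bob"
git config user.name "Bob"; git config user.email "bob@example.com"

# Case 1: Bob has no local commits, so his branch simply moves forward to B.
git pull --ff-only

# Case 2: Bob commits locally while Alice pushes again; the histories diverge.
echo "bob's note" > bob.txt
git add bob.txt && git commit -m "C"
cd "$sandbox/alice"
echo "line 3" >> file.txt
git commit -am "D" && git push
cd "$sandbox/bob"
git pull --ff-only || echo "refused: branches have diverged"
```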
.gitignore
Git keeps track of all changes within the project directory:
New file, folder
Modifications (changes or updates)
Deletion
If there are files or folders you don’t want to track, list them in a .gitignore file.
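A minimal .gitignore sketch. The file and folder names are hypothetical: a data folder and temporary files are typical candidates, but what to ignore depends on your project.

```shell
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init

# Files we do and do not want under version control.
mkdir -p R data
echo 'print("Hello world!")' > R/analysis.R
echo "id,value" > data/large_dataset.csv
echo "scratch" > notes.tmp

# One pattern per line: a folder (trailing slash) and a wildcard.
printf 'data/\n*.tmp\n' > .gitignore

# Ignored paths no longer show up as untracked.
git status --short

# Ask git which rule is responsible for ignoring a given path.
git check-ignore -v data/large_dataset.csv
```

The .gitignore file itself is normally committed, so that everyone working on the project ignores the same files.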
git clean
If you want to remove untracked files:
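A sketch of git clean in a temporary folder, with hypothetical file names. The dry run with -n is worth doing first, because files removed by git clean were never committed and cannot be restored from git.

```shell
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "keep me" > tracked.txt
git add tracked.txt
git commit -m "Add tracked.txt"

# Untracked leftovers: a file and a folder.
echo "scratch" > leftover.txt
mkdir tmp_output && echo "x" > tmp_output/result.txt

# Dry run: list what would be removed, without deleting anything.
git clean -n -d

# Delete untracked files (-f) and untracked folders (-d).
git clean -f -d

ls
```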